# Check if Biocondutor is installed
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
# Install knitr
if (!requireNamespace("knitr", quietly = TRUE))
install.packages("knitr")
if (!requireNamespace("GEOquery", quietly = TRUE))
BiocManager::install("GEOquery")
# Install edgeR
if (!requireNamespace("edgeR", quietly = TRUE))
BiocManager::install("edgeR")
# Install gprofiler2
if (!requireNamespace("gprofiler2", quietly = TRUE))
BiocManager::install("gprofiler2")
# Install RCy3
if (!requireNamespace("RCy3", quietly = TRUE))
BiocManager::install("RCy3")
importObj <- list("import/filtered_data_matrix.Rda",
"import/normalized_counts.Rda",
"import/rnaseq_df_HGNC.Rda",
"import/samples_info_dt.Rda")
for (item in importObj){load(item)}
library(knitr)
library(RCy3)
executables <- paste0(getwd(), "/executables")
unzip_gsea <- function() {
# Unzip the GSEA zip file
unzip(
zipfile = file.path(executables, "GSEA_4.3.3.zip"),
exdir = executables
)
# Define the path to the GSEA executable
gsea_executable <- file.path(executables, "GSEA_4.3.3", "gsea-cli.sh")
# Make the GSEA executable file executable (i.e., chmod +x)
system(paste("chmod +x", shQuote(gsea_executable)))
# Return the path to the GSEA executable
return(gsea_executable)
}
# Check if GSEA folder already exists in the downloads directory
if (!dir.exists(file.path(executables, "GSEA_4.3.3"))) {
# Call the function to download and unzip GSEA if necessary
gsea_executable <- unzip_gsea()
} else {
# Define the path to the GSEA executable
gsea_executable <- file.path(executables, "GSEA_4.3.3", "gsea-cli.sh")
}
The paper associated with my dataset, Immunoproteasome function maintains oncogenic gene expression in KMT2A-complex driven leukemia, investigates novel therapeutic strategies for KMT2A-rearranged acute myeloid leukemia (KMT2A-r AML) [pmc_10694946]. Current treatments involves pharmacologic targeting of chromatin-associated protein complexes. However, drug resistance remains one of the biggest issue with this treatment. The paper has identified the catalytic immunoproteasome subunit PSMB8 as a specific vulnerability in KMT2A-r AML. Targeting it enhances efficacy of Menin-inhibitors, suggesting a potential therapeutic strategy for this subtype of AML.
IMPORTANT DISCLAIMER: I was initially working with the filtered and normalized datasets from A1. However, it came to my attention that 2 of the samples (MI-503 Replicate #3 and DMSO Replicate #3) were not grouped properly in the MDS plot (figure below). I subsequently performed DGE analysis on the data and looked at the heatmap (figures below). It was especially evident in DMSO vs Combination where I noticed DMSO Replicate #3 incorrectly grouped with the Combinations samples. I checked my initial datasets and procedures conducted in A1 and find no errors. Thus, I suspect there might be mislabeling of the samples. With the advice of Professor Isserlin, I removed them as I re-do the filtering, normalization and DGE analysis.
The following figures show the problematic MDS plot and Heatmap Clustering (Gu and Eberl 2022), (Gu 2022):
Updated Overview:
Genes with low counts, as well as duplicate gene rows, were filtered out from the data. Then, I normalize the dataset via trimmed mean of m-values (TMM). After removing the problematic samples, I re-do the BCV plot and Mean-Variance plot. They have similar curves to the ones performed in A1.
The following figure shows the difference in my dataset pre- and
post-filtering.
The following figure shows the difference in my dataset pre- and
post-normalization. This is done via edgeR (Gordon Smyth 2022)
I perform DGE Analysis for 2 comparison groups: DMSO vs
MI-503 and DMSO vs Combination. This analysis is performed
by fitting a quasi-likelihood negative binomial GLM to the gene
expression dataset using edgeR package in R. FDR was
performed via B-H correction to correct for multiple hypothesis testing,
with alpha threshold set to 0.05. The number of genes that
passed the test are presented below and the results are visualized in
the following MA-plots for both comparisons.
| Comparison Group | # of genes that have p-val < .05 | # of genes that pass B-H correction |
|---|---|---|
| DMSO vs MI-503 | 7824 | 7300 |
| DMSO vs Combination | 8298 | 7800 |
IMPORTANT DISCLAIMER: Hereafter, my analysis will be focused on DMSO vs Combination, since that is the treatment of interest in the associated article (Tubı́o-Santamarı́a et al. 2023). G:Profiler was configured to perform multiple testing correction by applying FDR and excluding electronic annotations. Pathway sources used include REAC, WP and GO:BP. ORA is done for the up-regulated gene set,down-regulated gene set and all differentially expressed gene set.
Based on the key terms in the ORA for DMSO vs Combination, the DEGs and up-regulated genes share some common pathways and GO terms, which are related to the gene expression pathway and cell cycle. These are expected since the article (Tubı́o-Santamarı́a et al. 2023) is looking for gene targets that reduce the proliferation of AML (cancer). Interestingly, the down-regulated genes have terms that are more different from the DEGs and up-regulated genes. Proteasome degradation term was enriched in WikiPathways, which supports the article that reduction in proteasome helps reduce proliferation of AML.
- What method did you use? What genesets did you use? Make sure to specify versions and cite your methods.
In order to perform GSEA, we will need to create a rank file using
this equation:
\[\text{Rank} = -\log_{10}(p) \times
\text{sign}(logFC)\] where \(p\)
is the p-value of the gene and \(logFC\) is the log fold change of the gene.
We will using the gene set
Human_GOBP_AllPathways_noPFOCR_no_GO_iea_March_01_2024_symbol.gmt
from Bader Lab, which excludes electronically inferred annotations (EIA)
and pathway figure optical character recognition (PFOCR). This gene set
is last updated on March 1, 2024. The Analysis will be conducted via
GSEA CLI, version 4.3.3.
load("qlf_outputHits2_HGNC.rda")
# Create a data frame to store the ranks
rownames(qlf_outputHits2_HGNC) <- qlf_outputHits2_HGNC$gene
rank_df <- data.frame(
GeneName = rownames(qlf_outputHits2_HGNC),
Rank = -log10(qlf_outputHits2_HGNC$PValue) * sign(qlf_outputHits2_HGNC$logFC)
)
# Save the ranks to a file
rank_file <- file.path("DMSO_vs_Combination_rank.rnk")
write.table(rank_df, rank_file, sep = "\t", row.names = FALSE, quote = FALSE)
library(RCurl)
# Name the GSEA analysis
gsea_analysis_name <- "DMSO_vs_Combination"
# GSEA Output directory
gsea_output_dir <- file.path(".", "gsea_output")
# Define the input rank file
rank_file <- file.path(".", "DMSO_vs_Combination_rank.rnk")
download_gmt_file <- function(species) {
# URL to bader lab's gene set
URL <- "https://download.baderlab.org/EM_Genesets/March_01_2024/Human/symbol/Human_GO_AllPathways_noPFOCR_no_GO_iea_March_01_2024_symbol.gmt"
gmt_file <- "Human_GO_AllPathways_noPFOCR_no_GO_iea_March_01_2024_symbol.gmt"
# Download the file
download.file(
url = URL,
destfile = file.path(gmt_file)
)
# Return the name of the downloaded file
return(gmt_file)
}
# Download the gene set file
gmt_file <- file.path(download_gmt_file("Human"))
# Define the GSEA command
command <- sprintf(
"%s GSEAPreRanked -gmx %s -rnk %s -collapse false -nperm 1000 -scoring_scheme weighted -rpt_label %s -plot_top_x 20 -rnd_seed 12345 -set_max 200 -set_min 15 -zip_report false -out %s > gsea_output.txt",
shQuote(gsea_executable),
shQuote(gmt_file),
shQuote(rank_file),
shQuote(gsea_analysis_name),
shQuote(gsea_output_dir)
)
Now, running GSEA.
# Check if results already exist
gsea_result_folder <- list.files(gsea_output_dir)
if (length(gsea_result_folder) == 0) {
# Run the GSEA command
system(command)
}
# Get a list of the TSV files in the GSEA results directory
gsea_files <- list.files(paste(gsea_output_dir, gsea_result_folder, sep = "/"))
gsea_files <- gsea_files[grep(".tsv", gsea_files)]
# Get a list of the TSV files in the GSEA results directory
gsea_output_dir <- file.path("gsea_output")
gsea_result_folder <- file.path("DMSO_vs_Combination.GseaPreranked.1712414521489")
gsea_files <- list.files(paste(gsea_output_dir, gsea_result_folder, sep = "/"))
gsea_files <- gsea_files[grep(".tsv", gsea_files)]
# Find the overall reports of the GSEA
rx <- gregexpr("gsea_report_for_.*\\.tsv", gsea_files)
gsea_report_files <- gsea_files[unlist(rx) > 0]
na_pos_file <- gsea_report_files[grep("na_pos", gsea_report_files)]
na_neg_file <- gsea_report_files[grep("na_neg", gsea_report_files)]
# Load the GSEA results (.tsv) into a data frame
na_pos_report <- read.delim(
file.path(gsea_output_dir, gsea_result_folder, na_pos_file),
sep = "\t"
)
na_neg_report <- read.delim(
file.path(gsea_output_dir, gsea_result_folder, na_neg_file),
sep = "\t"
)
# Number of upregulated gene sets for both conditions
num_upreg_pos <- nrow(na_pos_report)
num_upreg_neg <- nrow(na_neg_report)
total_genesets <- num_upreg_pos + num_upreg_neg
# Number of gene sets with FDR < 5%
num_upreg_pos_fdr <- sum(na_pos_report$FDR < 0.05)
num_upreg_neg_fdr <- sum(na_neg_report$FDR < 0.05)
# Number of gene sets with pvalue < 1%
#num_upreg_pos_pval <- sum(na_pos_report$NOM.p.val < 0.01)
#num_upreg_neg_pval <- sum(na_neg_report$NOM.p.val < 0.01)
# Number of gene sets with pvalue < 5%
num_upreg_pos_pval_5 <- sum(na_pos_report$NOM.p.val < 0.05)
num_upreg_neg_pval_5 <- sum(na_neg_report$NOM.p.val < 0.05)
- Summarize your enrichment results.
Parameters used to run this GSEA Analysis: * only gene sets that has 15 to 200 genes are considered
Out of the 4411 gene sets considered, there are 2882 gene sets that that were enriched corresponding to genes that have increased expression change when comparing Combination to DMSO. When controlling for FDR < 0.05, 1240 gene sets are considered enriched.
In contrast, there are 3189 gene sets that were enriched corresponding to genes that have decreased expression change when comparing Combination to DMSO. When controlling for FDR < 0.05, 544 gene sets are considered enriched (considerably less).
I have presented the top 5 enriched gene sets in Combination treatment and DMSO treatment, respectively.
# Create a data frame to store the summary
gsea_summary <- data.frame(
Condition = c("Combination", "DMSO"),
NumGeneSets = c(num_upreg_pos, num_upreg_neg),
NumGeneSetsFDR = c(num_upreg_pos_fdr, num_upreg_neg_fdr),
NumGeneSetsPVal5 = c(num_upreg_pos_pval_5, num_upreg_neg_pval_5)
)
# Update the column names
colnames(gsea_summary) <- c("Condition", "Total Gene Sets", "Gene Sets (FDR < 5%)", "Gene Sets (p-value < 5%)")
# Display the summary
knitr::kable(
gsea_summary,
caption = "Summary of the GSEA results for DMSO vs Combination (MI-503 + PR-957)"
)
| Condition | Total Gene Sets | Gene Sets (FDR < 5%) | Gene Sets (p-value < 5%) |
|---|---|---|---|
| Combination | 2882 | 1240 | 1396 |
| DMSO | 3189 | 544 | 1074 |
Top 5 enriched gene sets in Combination Treatment vs DMSO Treatment
table_cols <- c("NAME", "ES", "NES", "FDR.q.val", "LEADING.EDGE")
kable(
na_pos_report[1:5, table_cols],
type = "html",
caption =
"Table 1. Top 5 NES GSEA Enriched Gene Sets expressed in Combination Treatment"
)
| NAME | ES | NES | FDR.q.val | LEADING.EDGE |
|---|---|---|---|---|
| HALLMARK_E2F_TARGETS%MSIGDBHALLMARK%HALLMARK_E2F_TARGETS | 0.7683635 | 3.483326 | 0 | tags=82%, list=18%, signal=99% |
| HALLMARK_G2M_CHECKPOINT%MSIGDBHALLMARK%HALLMARK_G2M_CHECKPOINT | 0.7013476 | 3.131183 | 0 | tags=76%, list=21%, signal=95% |
| HALLMARK_MYC_TARGETS_V2%MSIGDBHALLMARK%HALLMARK_MYC_TARGETS_V2 | 0.8115134 | 3.067383 | 0 | tags=83%, list=12%, signal=94% |
| HALLMARK_MYC_TARGETS_V1%MSIGDBHALLMARK%HALLMARK_MYC_TARGETS_V1 | 0.6843427 | 3.062547 | 0 | tags=72%, list=19%, signal=88% |
| MITOCHONDRIAL GENE EXPRESSION%GOBP%GO:0140053 | 0.6721708 | 2.987693 | 0 | tags=72%, list=25%, signal=95% |
kable(
na_neg_report[1:5, table_cols],
type = "html",
caption =
"Table 1. Top 5 NES GSEA Enriched Gene Sets expressed in DMSO Treatment"
)
| NAME | ES | NES | FDR.q.val | LEADING.EDGE |
|---|---|---|---|---|
| ABC TRANSPORTER DISORDERS%REACTOME DATABASE ID RELEASE 65%5619084 | -0.6571340 | -2.560738 | 0 | tags=53%, list=16%, signal=63% |
| PROTEASOME ACCESSORY COMPLEX%GOCC%GO:0022624 | -0.8382887 | -2.555173 | 0 | tags=68%, list=5%, signal=72% |
| PROTEASOME REGULATORY PARTICLE%GOCC%GO:0005838 | -0.8382887 | -2.514202 | 0 | tags=68%, list=5%, signal=72% |
| DEGRADATION OF THE EXTRACELLULAR MATRIX%REACTOME DATABASE ID RELEASE 65%1474228 | -0.6584710 | -2.488723 | 0 | tags=66%, list=21%, signal=83% |
| LYSOSOMAL LUMEN%GOCC%GO:0043202 | -0.6414781 | -2.484072 | 0 | tags=65%, list=19%, signal=80% |
- How do these results compare to the results from the thresholded analysis in Assignment #2. Compare qualitatively. Is this a straight forward comparison? Why or why not?
As a refresher, below are the signficantly-enriched terms found in the G:Profiler result for DMSO vs Combination.
Comparing the results of ORA and GSEA isn’t exactly straightforward. The input for ORA is the significant genes (up-regulated and down-regulated) after the differential expression analysis. The purpose is to identify whether a particular gene set is overrepresented in a list of genes. On the other hand, the input for GSEA is all the genes that have been ranked. The purpose is to determine whether a gene set shows significant differences between 2 states (Combination vs DMSO). GSEA results will be ranked by the normalized enrichment score while G:Profiler will be ordered by the adjusted p-value. Thus, results may be similar but ranked completely differently. We shouldn’t compare the results directly.
That being said, the results share similar themes.The GSEA positive ranked enriched gene sets and the positive fold change G:Profiler results involves common themes like gene expression regulation, such as Myc (Salvatori et al. 2011), and cell cycle checkpoint. On the other hand, the GSEA negative ranked enriched gene sets and the negative fold change G:Profiler results involves common themes like proteasome degradation. While the gene sets are not necessarily 1 to 1, these results matches the findings of the article associated with the dataset.
# Connect to Cytoscape
cyrest_url <- "http://localhost:1234/v1"
cytoscapePing(cyrest_url)
cytoscapeVersionInfo(cyrest_url)
apps <- c("enrichmentmap", "autoannotate")
installed_apps <- RCy3::getInstalledApps(base.url = cyrest_url)
new_apps <- c()
for (app in apps){
RCy3::installApp(app, base.url = cyrest_url)
new_apps <- c(new_apps, app)
}
enrichment_results <- file.path(
gsea_output_dir, gsea_result_folder, "edb", "results.edb")
write.table(normalized_counts, file="normalized_counts.txt", sep = "\t",
quote=FALSE)
write.table(normalized_counts_Combination_vs_DMSO,
file="normalized_counts_Combination_vs_DMSO.txt", sep="\t",
quote=FALSE)
expression_data <- file.path("normalized_counts.txt")
upload_data_to_cytoscape <- function(cyrest_url, files) {
# Store the uploaded file paths
uploaded_paths <- c()
for (file in files) {
# Build the POST request
bname <- basename(file)
r <- httr::POST(
url = paste0(
cyrest_url, "/enrichmentmap/textfileupload?fileName=", bname
),
config = list(),
body = list(file = httr::upload_file(file)),
encode = "multipart",
handle = NULL
)
# Check if the upload was successful
uploaded_paths <- c(uploaded_paths, httr::content(r, "parsed")$path)
}
return(uploaded_paths)
}
# Upload the required files to Cytoscape
em_files <- c(gmt_file, enrichment_results, expression_data, rank_file)
uploaded_files <- upload_data_to_cytoscape("http://localhost:1234", em_files)
# Store the host file paths
gmt_file_host <- uploaded_files[1]
enrichment_results_host <- uploaded_files[2]
expression_data_host <- uploaded_files[3]
rank_file_host <- uploaded_files[4]
# Define Analysis Name
em_analysis_name <- "Combination_vs_DMSO"
# Define the p-value threshold
pvalue_gsea_threshold <- 1.0
# Define the q-value threshold
qvalue_gsea_threshold <- 0.05
# Define the similarity threshold
similarity_threshold <- "0.375"
# Output the thresholds
knitr::kable(
data.frame(
PValue = pvalue_gsea_threshold,
QValue = qvalue_gsea_threshold,
Similarity = similarity_threshold
),
caption = "Thresholds used for the Enrichment Map analysis."
)
| PValue | QValue | Similarity |
|---|---|---|
| 1 | 0.05 | 0.375 |
#similarity metric to filter all the genesets connections/edges
# (can be OVERLAP, JACCARD, or COMBINED)
similarity_metric = "COMBINED"
# Function to create an enrichment map using the provided cyrest_url
#
# Args:
# cur_model_name: The name of the current model
# pvalue_gsea_threshold: The p-value threshold for the GSEA analysis
# qvalue_gsea_threshold: The q-value threshold for the GSEA analysis
# similarity_threshold: The similarity threshold for the enrichment map
# similarity_metric: The similarity metric for the enrichment map
# gsea_ranks_file: The file containing the GSEA ranks
# gsea_results_filename: The file containing the GSEA results
# expression_file_fullpath: The full path to the expression data file
# gmt_gsea_file: The file containing the GSEA gene sets
# current_base: The base URL of the Cytoscape REST API
#
# Returns:
# The response from the Cytoscape REST API
create_enrichment_map <- function(
cur_model_name, pvalue_gsea_threshold, qvalue_gsea_threshold,
similarity_threshold, similarity_metric, gsea_ranks_file,
gsea_results_filename, expression_file_fullpath, gmt_gsea_file,
current_base
) {
# Construct the network name
current_network_name <- paste(
cur_model_name,
pvalue_gsea_threshold,
qvalue_gsea_threshold,
sep = "_"
)
# Construct the EM command
em_command <- paste(
'enrichmentmap build analysisType="gsea"',
"gmtFile=", gmt_gsea_file,
"pvalue=", pvalue_gsea_threshold,
"qvalue=", qvalue_gsea_threshold,
"similaritycutoff=", similarity_threshold,
"coefficients=", similarity_metric,
"ranksDataset1=", gsea_ranks_file,
"enrichmentsDataset1=", gsea_results_filename,
"filterByExpressions=false",
"expressionDataset1=", expression_file_fullpath,
"gmtFile=", gmt_gsea_file,
sep = " "
)
# Execute the EM command
response <- RCy3::commandsGET(em_command, base.url = current_base)
# Initialize network SUID
current_network_suid <- 0
# Check if the command execution failed
if (grepl(pattern = "Failed", response)) {
return(paste(response))
} else {
current_network_suid <- response
}
# Check if the network name is unique
current_names <- RCy3::getNetworkList(base.url = current_base)
# If the name already exists, prepend the SUID to the name
if (current_network_name %in% current_names) {
current_network_name <- paste(
current_network_suid,
current_network_name,
sep = "_"
)
}
# Rename the network
response <- RCy3::renameNetwork(
title = current_network_name,
network = as.numeric(current_network_suid),
base.url = current_base
)
return(response)
}
# 1. Initial EM (p = 1.0, q = 0.05)
em_response <- create_enrichment_map(
em_analysis_name, pvalue_gsea_threshold, qvalue_gsea_threshold,
similarity_threshold, similarity_metric, rank_file_host,
enrichment_results_host, expression_data_host, gmt_file_host,
cyrest_url
)
em_response_suid <- em_response$network
- Create an enrichment map - how many nodes and how many edges in the resulting map? What thresholds were used to create this map? Make sure to record all thresholds. Include a screenshot of your network prior to manual layout.
Unmodified initial enrichment
map:
This enrichment map was created using Cytoscape app EnrichmentMap (Merico et al. 2010).
# Function to annotate the current enrichment map using the provided cyrest_url
#
# Args:
# current_base: The base URL of the Cytoscape REST API
#
# Returns:
# The response from the Cytoscape REST API
annotate_enrichment_map <- function(current_base) {
# Annotate the network
response <- RCy3::commandsGET(
paste(
"autoannotate annotate-clusterBoosted",
"network=current",
sep = " "
),
base.url = current_base
)
return(response)
}
# Function to summarize the current enrichment map using the provided cyrest_url
#
# Args:
# current_base: The base URL of the Cytoscape REST API
#
# Returns:
# The response from the Cytoscape REST API
summarize_enrichment_map <- function(current_base) {
# Annotate the network
response <- RCy3::commandsGET(
paste(
"autoannotate summary",
"network=current",
sep = " "
),
base.url = current_base
)
return(response)
}
# Annotate and summarize the initial network
annotated_network <- annotate_enrichment_map(cyrest_url)
# export_enrichment_map(
# cyrest_url,
# paste0(em_analysis_name, "_annotated_network.png"),
# em_output_dir
# )
annotated_network <- summarize_enrichment_map(cyrest_url)
annotated_network <- annotate_enrichment_map(cyrest_url)
# export_enrichment_map(
# cyrest_url,
# paste0(em_analysis_name, "_annotated_theme_network.svg"),
# em_output_dir
# )
- Annotate your network - what parameters did you use to annotate the network. If you are using the default parameters make sure to list them as well.
This annotated network is created via Cytoscape App AutoAnnotate (Kucera et al. 2016), which provides automatic pathway processes grouping.
The parameters are listed below (all default values):
- Make a publication ready figure - include this figure with proper legends in your notebook.
Publication Ready Enrichment Map:
- Collapse your network to a theme network. What are the major themes present in this analysis? Do they fit with the model? Are there any novel pathways or themes?
The major themes present in this analysis involve gene expression pathway and DNA repair pathway, as some of the key pathway nodes include sites replication initiation, mature sumoylation ubiquitinylation (which is involved in post-translation modification), and homologous recombination (which is involved in DNA repair). Meanwhile, the only major theme present in the DMSO treatment (blue node) is the kb hh degradation, which is involved in protein degradation via proteasome.
These themes fit with the model studied in the associated article. In the paper, the authors discovered that Combination (PR-957 + MI-503) treatment inactivates immunoproteasome subunit PSMB8, which in turn increases transcription factor BASP1. This means that there are increase in gene expression. There are some novel pathways/ themes that were not discussed in the paper, such as electron transport respiratory and macromolecule methylation ncrna. Electron transport respiratory is one of the targets for treatment of AML (Carter et al. 2020), while ncrna methylation is believed to be cancer hallmark in leukemia (Zhou et al. 2018).
- Do the enrichment results support conclusions or mechanism discussed in the original paper? How do these results differ from the results you got from Assignment #2 thresholded methods
In general, the enrichment results from GSEA and ORA do indeed support the mechanisms discussed in the original paper. In the control treatment, the protein degradation pathway are significantly enriched. The authors of the original paper describe the immunoproteasome subunit PSMB8 as responsible for the proliferation of KMT2r-AML. The Combination (PR957+MI503) treatment inactivates this immunoproteasome subunit and increases transcription factor BASP1. This is supported by the gene expression pathways (transcription, translations, and post-translation) being significantly enriched in GSEA.
Due to difference in methodology between GSEA and ORA, the significance levels of different gene sets varied vastly. Furthermore, it seems that the GSEA approach, i.e., enrichment analysis without thresholding, resulted in a broader set of genesets covering a variety of gene sets that may not have necessarily passed the threshold.
- Can you find evidence, i.e. publications, to support some of the results that you see. How does this evidence support your result?
E2F is a transcription factor considered to be a cancer hallmark for AML. Patients with AML usually have irregular E2F levels, as E2f regulates the expression of a number of genes important in cell proliferation, as well as DNA damage repair and development. Thus, an irregular E2f levels may result in uncontrolled cell proliferation, which is a distinct feature of cancer. With the Combination treatment, E2F gene set is significantly enriched, which suggest that this gene may be a potential therapeutic target. Though this transcription factor isn’t really discussed in the article, another article (Feng et al. 2020) investigates and support this result.
Gene sets involving proteasome complex and regulation are significantly enriched in the GSEA result expressed in DMSO treatment. This is discussed in both the original paper and supported by another article (Antony et al. 2021). Without the PR957+MI503 treatment, the proteasomes are not inhibited. The proteasomes degrade the Menin inhibitors, which were supposed to inhibit the proliferation of leukemic cells. This effectively reduces the efficacy of previous AML drugs that target menin as the AML cells can self-renew and proliferate.
Add a post analysis to your main network using specific transcription factors, microRNAs or drugs. Include the reason why you chose the specific miRs, TFs or drugs (i.e publications indicating that they might be related to your model). What does this post analysis show?
This
post-analysis was created using the Cytoscape App
EnrichmentApp. I added
Human_DrugBank_approved_symbol.gmt as the input file for
potential signature gene sets. This file is downloaded from Bader Lab.
Only the gene sets that pass the the Mann-Whitney Test (Two-Sided)
cutoff of 0.01 were selected (18 total).
These drugs were used for my post-analysis because I wanted to explore the drugs whose gene sets are significantly enriched in my data. Finding these drugs may help provide potential candidates for KMT2A-r AML treatment.
Two particular results caught my attention. NADPH pathway is
significantly enriched. This is supported in an article titled NADPH
metabolism determines the leukemogenic capacity and drug resistance of
AML cells (Chen et al. 2022).
NADH/NADPH is involve in the redox metabolism. The authors discover that
NADPH homeostasis regulation is crucial in determining leukemogenic
capacity and drug resistance in AML patients. This is also supported in
my GSEA results, where pathways related to
Mitchondrial Gene Expression is significantly enriched.
In addition, Fostamatinib is also significantly enriched in the post-analysis enrichment map. This is supported in an articel titled SYK inhibition targets acute myeloid leukemia stem cells by blocking their oxidative metabolism (Polak et al. 2020). Fostamatinib is a small-molecule SKY inhibitor. It targets leukemia stem cells by blocking their oxidative metabolism, thus presents itself as a potential drug target. However, there are not many researches that have been done on using Fostamatinib as drug treatment for AML yet.